 dynamic routing



Dynamic Routing Between Capsules

Neural Information Processing Systems

A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or object part. We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters.
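
The length-as-probability idea can be illustrated with the squashing nonlinearity the paper uses (its Eq. 1), which shrinks a vector's length into [0, 1) without changing its direction. The sketch below is a minimal plain-Python version; the function names and the `eps` guard against division by zero are assumptions made for illustration, not part of the paper.

```python
import math

def squash(s, eps=1e-9):
    """Rescale vector s so its length lies in [0, 1), keeping its direction.

    Follows v = (|s|^2 / (1 + |s|^2)) * (s / |s|); eps is an assumed guard
    so that a zero vector maps to a zero vector instead of dividing by zero.
    """
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / (math.sqrt(norm_sq) + eps)
    return [scale * x for x in s]

def length(v):
    """Euclidean length, read as the probability that the entity exists."""
    return math.sqrt(sum(x * x for x in v))

v = squash([3.0, 4.0])
print(length(v))    # close to 25/26: a long input vector maps to a high probability
print(v[0] / v[1])  # orientation (the ratio of components) is preserved
```

Short input vectors are pushed toward length zero and long ones toward length one, which is what lets the length act as an existence probability.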


A Control-Theoretic Approach to Dynamic Payment Routing for Success Rate Optimization

Agrawal, Aniket, Patil, Harsharanga

arXiv.org Artificial Intelligence

This paper introduces a control-theoretic framework for dynamic payment routing, implemented within JUSPAY's Payment Orchestrator to maximize transaction success rate. The routing system is modeled as a closed-loop feedback controller that continuously senses gateway [3] performance, computes corrective actions, and dynamically routes transactions across gateways to ensure operational resilience. The system leverages concepts from control theory, reinforcement learning, and multi-armed bandit optimization to achieve both short-term responsiveness and long-term stability. Rather than relying on explicit PID regulation, the framework applies generalized feedback-based adaptation, ensuring that corrective actions remain proportional to observed performance deviations and that the computed gateway score gradually converges toward the success rate [2]. This hybrid approach unifies control theory and adaptive decision systems, enabling self-regulating transaction routing that dampens instability and improves reliability. Live production results show an improvement of up to 1.15% in success rate over traditional rule-based routing, demonstrating the effectiveness of feedback-based control in payment systems.
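
One way to picture a correction that is proportional to the observed deviation is an exponential moving average of transaction outcomes paired with epsilon-greedy gateway selection. The sketch below is only an illustration of that idea, not the system described in the paper: the class `GatewayRouter`, the parameters `alpha` and `epsilon`, the initial score of 0.5, and the simulated success rates are all hypothetical.

```python
import random

class GatewayRouter:
    """Hypothetical feedback-style router: one score per gateway."""

    def __init__(self, gateways, alpha=0.1, epsilon=0.1, seed=0):
        self.scores = {g: 0.5 for g in gateways}  # assumed neutral starting score
        self.alpha = alpha      # feedback gain: how strongly each outcome corrects the score
        self.epsilon = epsilon  # exploration rate, as in a multi-armed bandit
        self.rng = random.Random(seed)

    def choose(self):
        """Mostly route to the best-scoring gateway, occasionally explore."""
        names = sorted(self.scores)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(names)
        return max(names, key=self.scores.get)

    def update(self, gateway, success):
        """Move the score toward the outcome in proportion to the error."""
        error = (1.0 if success else 0.0) - self.scores[gateway]
        self.scores[gateway] += self.alpha * error

def simulate(true_rates, steps=5000, seed=0):
    """Run the router against gateways with fixed, made-up success rates."""
    router = GatewayRouter(true_rates, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(steps):
        g = router.choose()
        router.update(g, env.random() < true_rates[g])
    return router.scores

print(simulate({"gateway_a": 0.9, "gateway_b": 0.6}))
```

With a fixed gain, each score fluctuates around the gateway's observed success rate rather than settling exactly on it; a smaller `alpha` gives smoother but slower tracking.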


Re-Representation in Sentential Relation Extraction with Sequence Routing Algorithm

Bahrami, Ramazan Ali, Yahyapour, Ramin

arXiv.org Artificial Intelligence

Sentential relation extraction (RE) is an important task in natural language processing (NLP). In this paper we propose to do sentential RE with dynamic routing in capsules. We first show that the proposed approach outperforms the state of the art on the common sentential relation extraction datasets Tacred, Tacredrev, Retacred, and Conll04. We then investigate potential reasons for its good performance on these datasets and its low performance on a similar but larger sentential RE dataset, Wikidata. We identify noise in Wikidata labels as one of the factors that can hinder performance. Additionally, we show that better performance is associated with better re-representation, a term from neuroscience that refers to a change of representation in the human brain to improve the match at comparison time. For example, given the analogy King:Queen::Man:Woman, re-representation at comparison time increases the similarity between the related head terms (King, Man) and between the tail terms (Queen, Woman). Our observations show that the proposed model performs re-representation better than the vanilla model it is compared with. To that end, besides noise in the labels of distantly supervised RE datasets, we propose re-representation as a challenge in sentential RE.



Semi-Gradient SARSA Routing with Theoretical Guarantee on Traffic Stability and Weight Convergence

Wu, Yidan, Yu, Yu, Zhang, Jianan, Jin, Li

arXiv.org Artificial Intelligence

We consider the traffic control problem of dynamic routing over parallel servers, which arises in a variety of engineering systems such as transportation and data transmission. We propose a semi-gradient, on-policy algorithm that learns an approximate optimal routing policy. The algorithm uses generic basis functions with flexible weights to approximate the value function across the unbounded state space. Consequently, the training process lacks Lipschitz continuity of the gradient, boundedness of the temporal-difference error, and a prior guarantee on ergodicity, which are the standard prerequisites in existing literature on reinforcement learning theory. To address this, we combine a Lyapunov approach and an ordinary differential equation-based method to jointly characterize the behavior of traffic state and approximation weights. Our theoretical analysis proves that the training scheme guarantees traffic state stability and ensures almost sure convergence of the weights to the approximate optimum. We also demonstrate via simulations that our algorithm attains significantly faster convergence than neural network-based methods with an insignificant approximation error.
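
For orientation, the textbook semi-gradient SARSA step with a linear value approximation can be sketched as follows. This is not the paper's algorithm or its basis functions: the quadratic queue-length features, the step size `alpha`, the discount `gamma`, and all function names are assumptions chosen to keep the example small.

```python
def features(queues, action):
    """Hypothetical basis functions: squared queue lengths after routing to `action`."""
    return [float((q + (1 if k == action else 0)) ** 2) for k, q in enumerate(queues)]

def q_value(w, phi):
    """Linear action-value estimate: inner product of weights and features."""
    return sum(wi * pi for wi, pi in zip(w, phi))

def sarsa_update(w, phi, reward, phi_next, alpha=0.01, gamma=0.95):
    """One semi-gradient SARSA step.

    The TD target uses the next state-action pair actually taken (on-policy),
    and the gradient is taken only through q(s, a), whose gradient for a
    linear model is the feature vector phi itself.
    """
    td_error = reward + gamma * q_value(w, phi_next) - q_value(w, phi)
    return [wi + alpha * td_error * pi for wi, pi in zip(w, phi)]

def greedy_action(w, queues):
    """Route the arriving job to the server with the highest estimated value."""
    return max(range(len(queues)), key=lambda a: q_value(w, features(queues, a)))

# With negative weights on squared queue length, the greedy choice is the shorter queue.
w = [-1.0, -1.0]
print(greedy_action(w, [2, 0]))  # index of the server with the shorter queue
```

Because queue lengths are unbounded, features like these are unbounded too, which is the setting in which the abstract notes that standard convergence prerequisites fail.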


Reviews: Self-Routing Capsule Networks

Neural Information Processing Systems

Post-rebuttal: I have considered the opinion and viewpoint of the other reviewers, who have both provided some good insight on the paper. I have also read the response of the authors very carefully, which has provided some more information. I am happy to revise my score to reflect the new evidence the authors have provided. In that sense an expert is specialising in a different region of the input space, whose contributions are adjusted differently per example/input. What happens in Dynamic Routing and EM is that the agreement between a higher-level and a lower-level capsule is paramount for deciding if something is present in an image or which information to keep based on a voting process.


Reviews: Dynamic Routing Between Capsules

Neural Information Processing Systems

Overview: this paper introduces a dynamic routing process for connecting layers in a feedforward neural net, as described in Procedure 1 on p. 3. The key idea here is that the coupling coeff c_ij between unit i and unit j is computed dynamically (layerwise), taking into account the agreement between the output v_j of unit j and the prediction from unit i, \hat{u}_{j|i}. This process iterates between each layer l and l+1, but does not (as far as I can tell) spread further back. Another innovation used in the paper is a form of nonlinearity as in eq 1 for units which uses the length of the capsule output v_j to encode strength of activity, and the direction of v_j to encode the values of the capsule parameters. A shallow CapsNet model is trained on MNIST, and obtains very good performance (a check of the MNIST leaderboard shows best performance of 0.23 obtained with a committee of deep conv nets), cf. performance in Table 1. I regard this paper as very interesting, as it has successfully married the capsules idea with conv nets, and makes use of the dynamic routing capabilities.
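
The routing-by-agreement loop the review summarizes can be sketched in plain Python for a single pair of layers. The names below (`u_hat`, `dynamic_routing`, `eps`) and the nested-list data layout are illustrative assumptions. The update order follows the routing procedure as commonly stated: a softmax over the routing logits, a weighted sum, a squash, then an agreement update.

```python
import math

def squash(s, eps=1e-9):
    """Shrink the length of s into [0, 1) while preserving its direction."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / (math.sqrt(norm_sq) + eps)
    return [scale * x for x in s]

def softmax(row):
    """Numerically stable softmax over one lower capsule's routing logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is lower capsule i's prediction vector for upper capsule j.

    Returns the upper-capsule outputs v and the coupling coefficients c
    that were used in the final iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits b_ij start at zero
    for _ in range(iterations):
        c = [softmax(row) for row in b]       # c_ij: each lower capsule's couplings sum to 1
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in)) for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):
            for j in range(n_out):
                # Agreement: dot product between the prediction and the output v_j.
                b[i][j] += sum(a * z for a, z in zip(u_hat[i][j], v[j]))
    return v, c

# Both lower capsules agree on upper capsule 0 and disagree on upper capsule 1.
u_hat = [[[1.0, 0.0], [1.0, 0.0]],
         [[1.0, 0.0], [-1.0, 0.0]]]
v, c = dynamic_routing(u_hat)
print(c[0])  # coupling shifts toward upper capsule 0, where predictions agree
```

In this toy input the conflicting predictions for upper capsule 1 cancel, so its output has zero length and its routing logits never grow.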


PDR-CapsNet: an Energy-Efficient Parallel Approach to Dynamic Routing in Capsule Networks

Javadinia, Samaneh, Baniasadi, Amirali

arXiv.org Artificial Intelligence

Convolutional Neural Networks (CNNs) have produced state-of-the-art results for image classification tasks. However, they are limited in their ability to handle rotational and viewpoint variations due to information loss in max-pooling layers. Capsule Networks (CapsNets) employ a computationally expensive iterative process referred to as dynamic routing to address these issues. CapsNets, however, often fall short on complex datasets and require more computational resources than CNNs. To overcome these challenges, we introduce the Parallel Dynamic Routing CapsNet (PDR-CapsNet), a deeper and more energy-efficient alternative to CapsNet that offers superior performance, less energy consumption, and lower overfitting rates. By leveraging a parallelization strategy, PDR-CapsNet mitigates the computational complexity of CapsNet and increases throughput, efficiently using hardware resources. As a result, on the CIFAR-10 dataset we achieve 83.55% accuracy while requiring 87.26% fewer parameters and 32.27% and 47.40% fewer MACs and FLOPs, respectively, with 3x faster inference and 7.29J less energy consumption on a 2080Ti GPU with 11GB VRAM compared to CapsNet.


XnODR and XnIDR: Two Accurate and Fast Fully Connected Layers For Convolutional Neural Networks

Sun, Jian, Fard, Ali Pourramezan, Mahoor, Mohammad H.

arXiv.org Artificial Intelligence

Capsule Network is powerful at defining the positional relationship between features in deep neural networks for visual recognition tasks, but it is computationally expensive and not suitable for running on mobile devices. The bottleneck is in the computational complexity of the Dynamic Routing mechanism used between the capsules. On the other hand, XNOR-Net is fast and computationally efficient, though it suffers from low accuracy due to information loss in the binarization process. To address the computational burdens of the Dynamic Routing mechanism, this paper proposes new Fully Connected (FC) layers by xnorizing the linear projection outside or inside the Dynamic Routing within the CapsFC layer. Specifically, our proposed FC layers have two versions, XnODR (Xnorize the Linear Projection Outside Dynamic Routing) and XnIDR (Xnorize the Linear Projection Inside Dynamic Routing). To test the generalization of both XnODR and XnIDR, we insert them into two different networks, MobileNetV2 and ResNet-50. Our experiments on three datasets, MNIST, CIFAR-10, and MultiMNIST validate their effectiveness. The results demonstrate that both XnODR and XnIDR help networks to have high accuracy with lower FLOPs and fewer parameters (e.g., 96.14% correctness with 2.99M parameters and 311.74M FLOPs on CIFAR-10).
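
The speed and accuracy trade-off of XNOR-style binarization can be seen in a small example. Each vector is replaced by its signs plus one scaling factor, and the dot product is then computed with a bitwise XNOR and a population count. This is a generic sketch of that approximation, not the XnODR or XnIDR layers themselves; the function names and the bit-packing scheme are assumptions.

```python
def binarize(vec):
    """Pack the signs of vec into an integer and return it with the mean magnitude.

    Bit k is 1 when vec[k] >= 0 (treated as +1) and 0 otherwise (treated as -1).
    """
    scale = sum(abs(x) for x in vec) / len(vec)
    bits = 0
    for k, x in enumerate(vec):
        if x >= 0:
            bits |= 1 << k
    return bits, scale

def xnor_dot(bits_a, bits_b, n):
    """Dot product of two n-element {-1, +1} vectors via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(bits_a ^ bits_b) & mask      # 1 wherever the signs match
    return 2 * bin(agree).count("1") - n   # matches minus mismatches

def approx_dot(w, x):
    """Approximate w . x as scale_w * scale_x * (sign(w) . sign(x))."""
    bw, alpha = binarize(w)
    bx, beta = binarize(x)
    return alpha * beta * xnor_dot(bw, bx, len(w))

w = [0.9, -1.1, 1.0, -1.0]
x = [2.0, -1.5, 0.5, 1.0]
exact = sum(a * b for a, b in zip(w, x))
print(exact, approx_dot(w, x))  # the gap is the information lost to binarization
```

When all magnitudes in a vector are equal the approximation is exact; the more the magnitudes vary, the larger the error, which is the accuracy loss the abstract attributes to binarization.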